1.
Leuk Res ; 131: 107325, 2023 08.
Article in English | MEDLINE | ID: mdl-37302352

ABSTRACT

Use of the potent tyrosine kinase inhibitor imatinib as the first-line treatment in chronic myeloid leukemia (CML) has decreased mortality from 20% to 2%. However, approximately 30% of CML patients experience imatinib resistance, largely because of point mutations in the kinase domain of the BCR-ABL1 fusion gene. The aim of this study was to use next-generation sequencing (NGS) to identify mutations related to imatinib resistance. The study included 22 patients diagnosed with CML and experiencing no clinical response to imatinib. Total RNA was used for cDNA synthesis, with amplification of a fragment encompassing the BCR-ABL1 kinase domain using a nested-PCR approach. Sanger sequencing and NGS were applied to detect genetic alterations. HaplotypeCaller was used for variant calling, and STAR-Fusion software was applied for fusion breakpoint identification. After sequencing analysis, the F311I, F317L, and E450K mutations were detected, one each, in three different participants, and in another two patients, single nucleotide variants in BCR (rs9608100, rs140506, rs16802) and ABL1 (rs35011138) were detected. Eleven patients carried e14a2 transcripts, nine had e13a2 transcripts, and both transcripts were identified in one patient. One patient had co-expression of e14a2 and e14a8 transcripts. The results identify candidate single nucleotide variants and co-expressed BCR-ABL1 transcripts in cellular resistance to imatinib.


Subjects
Fusion Proteins, bcr-abl; Leukemia, Myelogenous, Chronic, BCR-ABL Positive; Humans; Imatinib Mesylate/therapeutic use; Fusion Proteins, bcr-abl/genetics; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/drug therapy; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics; Leukemia, Myelogenous, Chronic, BCR-ABL Positive/diagnosis; Mutation; Protein Kinase Inhibitors/therapeutic use; Nucleotides/therapeutic use; Drug Resistance, Neoplasm/genetics
2.
Viruses ; 15(2)2023 01 28.
Article in English | MEDLINE | ID: mdl-36851588

ABSTRACT

BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causing coronavirus disease 2019 (COVID-19) is the most transmissible β-coronavirus in history, affecting all population groups. Immunocompromised patients, particularly cancer patients, have been highlighted as a reservoir to promote accumulation of viral mutations throughout persistent infection. CASE PRESENTATION: We aimed to describe the clinical course and SARS-CoV-2 mutation profile for 102 days in an immunocompromised patient with non-Hodgkin's lymphoma and COVID-19. We used RT-qPCR to quantify SARS-CoV-2 viral load over time and whole-virus genome sequencing to identify viral lineage and mutation profile. The patient presented with a persistent infection through 102 days while being treated with cytotoxic chemotherapy for non-Hodgkin's lymphoma and received targeted therapy for COVID-19 with remdesivir and hyperimmune plasma. All sequenced samples belonged to the BA.1.1 lineage. We detected nine amino acid substitutions in five viral genes (Nucleocapsid, ORF1a, ORF1b, ORF13a, and ORF9b), grouped in two clusters: the first cluster with amino acid substitutions only detected on days 39 and 87 of sample collection, and the second cluster with amino acid substitutions only detected on day 95 of sample collection. The Spike gene remained unchanged in all samples. Viral load was dynamic but consistent with the disease flares. CONCLUSIONS: This report shows that the multiple mutations that occur in an immunocompromised patient with persistent COVID-19 could provide information regarding viral evolution and emergence of new SARS-CoV-2 variants.


Subjects
COVID-19; Lymphoma, Non-Hodgkin; Humans; SARS-CoV-2/genetics; COVID-19/diagnosis; Virus Shedding; Persistent Infection; Lymphoma, Non-Hodgkin/complications; Lymphoma, Non-Hodgkin/drug therapy; Immunocompromised Host
3.
Front Public Health ; 11: 1321283, 2023.
Article in English | MEDLINE | ID: mdl-38419814

ABSTRACT

Background: Since its appearance, COVID-19 has immensely impacted our society. Public health measures, from the initial lockdowns to vaccination campaigns, have mitigated the crisis. However, SARS-CoV-2's persistence and evolving variants continue to pose global threats, increasing the risk of reinfections. Despite vaccination progress, understanding reinfections remains crucial for informed public health responses. Methods: We collected available clinical and genomic data for SARS-CoV-2 samples from patients treated in Mexico City from 2020 epidemiological week 10 to 2023 epidemiological week 06, encompassing the entire public health emergency period. To obtain clinical data, we used the SISVER (Respiratory Disease Epidemiological Surveillance System) database for SARS-CoV-2 patients who received medical attention in Mexico City. For genomic surveillance, we analyzed genomic data previously uploaded to GISAID by Mexican institutions. We used these data sources to generate descriptors of case number, hospitalization, death and reinfection rates, and viral variant prevalence throughout the pandemic period. Findings: The fraction of reinfected individuals in the COVID-19 infected population steadily increased as the pandemic progressed in Mexico City. Most reinfections occurred during the fifth wave (40%). This wave was characterized by the coexistence of multiple variants exceeding 80% prevalence, whereas all other waves showed a single characteristic dominant variant (prevalence >95%). Shifts in patient care type and symptom severity were observed: 2.53% of patients transitioned from hospitalized to ambulatory care during reinfection and 0.597% showed the opposite behavior; in addition, 7.23% showed a reduction in symptom severity and 6.05% displayed an increase in severity. Unvaccinated individuals accounted for the highest percentage of reinfections (41.6%), followed by vaccinated individuals (31.9%). Most reinfections occurred after the fourth wave, dominated by the Omicron variant, and after the vaccination campaign was already underway. Interpretation: Our analysis suggests reduced infection severity in reinfections, evident through shifts in symptom severity and care patterns. Unvaccinated individuals accounted for most reinfections. While our study centers on Mexico City, its findings may hold implications for broader regions, contributing insights into reinfection dynamics.


Subjects
COVID-19; Public Health; Humans; Reinfection; COVID-19/epidemiology; Mexico/epidemiology; Communicable Disease Control; SARS-CoV-2
4.
R Soc Open Sci ; 9(5): 220031, 2022 May.
Article in English | MEDLINE | ID: mdl-35620002

ABSTRACT

Retinoblastoma (Rb) is a rare intraocular tumour in early childhood, with an approximate incidence of 1 in 18 000 live births. Experimental studies for Rb are complex due to the challenges associated with obtaining a normal retina to contrast with diseased tissue. In this work, we reanalyse a dataset that contains normal retina samples. We identified the individual genes whose expression is different in Rb in contrast with normal tissue, determined the pathways whose global expression pattern is more distant from the global expression observed in normal tissue, and finally, we identified which transcription factors regulate the highest number of differentially expressed genes (DEGs) and proposed them as transcriptional master regulators (TMRs). The enrichment of DEGs in the phototransduction and retrograde endocannabinoid signalling pathways could be associated with abnormal behaviour of the processes leading to cellular differentiation and cellular proliferation. On the other hand, the TMRs nuclear receptor subfamily 5 group A member 2 and hepatocyte nuclear factor 4 gamma are involved in hepatocyte differentiation. Therefore, the enrichment of aberrant expression in these transcription factors could suggest an abnormal retina development that could be involved in Rb origin and progression.

5.
Microb Genom ; 8(5)2022 05.
Article in English | MEDLINE | ID: mdl-35584008

ABSTRACT

Genomics has set the basis for a variety of methodologies that produce high-throughput datasets identifying the different players that define gene regulation, particularly regulation of transcription initiation and operon organization. These datasets are available in public repositories, such as the Gene Expression Omnibus, or ArrayExpress. However, accessing and navigating such a wealth of data is not straightforward. No resource currently exists that offers all available high and low-throughput data on transcriptional regulation in Escherichia coli K-12 to easily use both as whole datasets, or as individual interactions and regulatory elements. RegulonDB (https://regulondb.ccg.unam.mx) began gathering high-throughput dataset collections in 2009, starting with transcription start sites, then adding ChIP-seq and gSELEX in 2012, with up to 99 different experimental high-throughput datasets available in 2019. In this paper we present a radical upgrade to more than 2000 high-throughput datasets, processed to facilitate their comparison, introducing up-to-date collections of transcription termination sites, transcription units, as well as transcription factor binding interactions derived from ChIP-seq, ChIP-exo, gSELEX and DAP-seq experiments, besides expression profiles derived from RNA-seq experiments. For ChIP-seq experiments we offer both the data as presented by the authors, as well as data uniformly processed in-house, enhancing their comparability, as well as the traceability of the methods and reproducibility of the results. Furthermore, we have expanded the tools available for browsing and visualization across and within datasets. We include comparisons against previously existing knowledge in RegulonDB from classic experiments, a nucleotide-resolution genome viewer, and an interface that enables users to browse datasets by querying their metadata. 
A particular effort was made to automatically extract detailed experimental growth conditions by implementing an assisted curation strategy that applies natural language processing and machine learning. We provide summaries with the total number of interactions found in each experiment, as well as tools to identify common results among different experiments. This is a long-awaited resource to make use of such a wealth of knowledge and advance our understanding of the biology of the model bacterium E. coli K-12.


Subjects
Escherichia coli K12; Escherichia coli; Escherichia coli/genetics; Escherichia coli K12/genetics; Escherichia coli K12/metabolism; Gene Expression Regulation, Bacterial; Operon/genetics; Reproducibility of Results
6.
Sci Rep ; 12(1): 4759, 2022 03 19.
Article in English | MEDLINE | ID: mdl-35306521

ABSTRACT

End-point RT-PCR is a suitable alternative diagnostic technique since it is cheaper than RT-qPCR tests and can be implemented on a massive scale in low- and middle-income countries. In this work, a bioinformatic approach to guide the design of PCR primers was developed, and an alternative diagnostic test based on end-point PCR was designed. End-point PCR primers were designed through conservation analysis based on k-mer frequency in SARS-CoV-2 and human respiratory pathogen genomes. Highly conserved regions were identified for primer design, and the resulting PCR primers were used to amplify 871 nasopharyngeal human samples with a previous RT-qPCR based SARS-CoV-2 diagnosis. The diagnostic test showed high accuracy in identifying SARS-CoV-2-positive samples including B.1.1.7, P.1, B.1.427/B.1.429 and B.1.617.2/AY samples with a detection limit of 7.2 viral copies/µL. In addition, this test could discern SARS-CoV-2 infection from other viral infections with COVID-19-like symptomatology. The designed end-point PCR diagnostic test to detect SARS-CoV-2 is a suitable alternative to RT-qPCR. Since the proposed bioinformatic approach can be easily applied in thousands of viral genomes and over highly divergent strains, it can be used as a PCR design tool as new SARS-CoV-2 variants emerge. Therefore, this end-point PCR test could be employed in epidemiological surveillance to detect new SARS-CoV-2 variants as they emerge and propagate.


Subjects
COVID-19; SARS-CoV-2; COVID-19/diagnosis; COVID-19/epidemiology; COVID-19 Testing; Humans; RNA, Viral/analysis; RNA, Viral/genetics; Reverse Transcriptase Polymerase Chain Reaction; SARS-CoV-2/genetics
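
The k-mer-based conservation analysis mentioned in the abstract for entry 6 can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy sequences, and the choice of k = 4 are all assumptions made for illustration. For each k-mer start position in a reference, it reports the fraction of genomes that contain that exact k-mer, so fully conserved stretches score 1.0.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of distinct k-mers found in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conservation_profile(reference, genomes, k):
    """For each k-mer start position in `reference`, return the fraction of
    `genomes` that contain that exact k-mer (hypothetical helper)."""
    counts = Counter()
    for genome in genomes:
        counts.update(kmers(genome, k))  # each genome counts once per k-mer
    n = len(genomes)
    return [counts[reference[i:i + k]] / n
            for i in range(len(reference) - k + 1)]


# Toy, made-up sequences (not real viral genomes).
reference = "ACGTACGGTA"
genomes = ["ACGTACGGTA", "ACGTACGGTA", "ACGAACGGTA"]  # third has one substitution
profile = conservation_profile(reference, genomes, k=4)

# Positions whose k-mer is present in every genome are primer candidates.
conserved = [i for i, frac in enumerate(profile) if frac == 1.0]
print(profile)
print(conserved)
```

In this toy input, the substitution in the third genome lowers the score of every reference k-mer that overlaps it, and only the windows past the substitution stay fully conserved.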
7.
Viruses ; 14(3)2022 03 06.
Article in English | MEDLINE | ID: mdl-35336952

ABSTRACT

Omicron is the most mutated SARS-CoV-2 variant, a factor that can affect transmissibility, disease severity, and immune evasiveness. Its genomic surveillance is important in economic centers with millions of inhabitants, such as Mexico City. Results. From 16 November to 31 December 2021, we observed an increase of 88% in Omicron prevalence in Mexico City. We explored the R346K substitution, prevalent in 42% of Omicron variants, known to be associated with immune escape by monoclonal antibodies. In a phylogenetic analysis, we found several independent exchanges between Mexico and the world, and there was an event followed by local transmission that gave rise to most of the Omicron diversity in Mexico City. A haplotype analysis revealed that there was no association between haplotype and vaccination status. Among the 66% of patients who have been vaccinated, no reported comorbidities were associated with Omicron; the presence of odynophagia and the absence of dysgeusia were significant predictor symptoms for Omicron, and the RT-qPCR Ct values were lower for Omicron. Conclusions. Genomic surveillance is key to detecting the emergence and spread of SARS-CoV-2 variants in a timely manner, even weeks before the onset of an infection wave, and can inform public health decisions and detect the spread of any mutation that may affect therapeutic efficacy.


Subjects
COVID-19; SARS-CoV-2; COVID-19/epidemiology; Cities/epidemiology; Genomics; Humans; Mexico/epidemiology; Phylogeny; SARS-CoV-2/genetics
8.
Viruses ; 14(3)2022 03 09.
Article in English | MEDLINE | ID: mdl-35336968

ABSTRACT

The spread of the newly emerged severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has led to more than 430 million confirmed cases, including more than 5.9 million deaths, reported worldwide as of 24 February 2022. Conservation of viral genomes is important for pathogen identification and diagnosis, therapeutics development and epidemiological surveillance to detect the emergence of new viral variants. An intense surveillance of virus variants has led to the identification of Variants of Interest and Variants of Concern. Although these classifications dynamically change as the pandemic evolves, they have been useful to guide public health efforts on containment and mitigation. In this work, we present CovDif, a tool to detect conserved regions between groups of viral genomes. CovDif creates a conservation landscape for each group of genomes of interest and a differential landscape able to highlight differences in the conservation level between groups. CovDif is able to identify loss in conservation due to point mutations, deletions, inversions and chromosomal rearrangements. In this work, we applied CovDif to SARS-CoV-2 clades (G, GH, GR, GV, L, O, S and G) and variants. We identified all regions for any defining SNPs. We also applied CovDif to a group of population genomes and evaluated the conservation of primer regions for current SARS-CoV-2 detection and diagnostic protocols. We found that some of these protocols should be applied with caution, as a few of the primer-template regions are no longer conserved in some SARS-CoV-2 variants. We conclude that CovDif is a tool that could be widely applied to study the conservation of any group of viral genomes as long as whole genomes exist.


Subjects
COVID-19; SARS-CoV-2; COVID-19/diagnosis; Genome, Viral; Humans; Point Mutation; SARS-CoV-2/genetics
9.
Viruses ; 13(11)2021 10 29.
Article in English | MEDLINE | ID: mdl-34834987

ABSTRACT

The SARS-CoV-2 pandemic is one of the most concerning health problems around the globe. We reported the emergence of SARS-CoV-2 variant B.1.1.519 in Mexico City. We reported the effective reproduction number (Rt) of B.1.1.519 and presented evidence of its geographical origin based on phylogenetic analysis. We also studied its evolution via haplotype analysis and identified the most recurrent haplotypes. Finally, we studied the clinical impact of B.1.1.519. The B.1.1.519 variant was predominant between November 2020 and May 2021, reaching 90% of all cases sequenced in February 2021. It is characterized by three amino acid changes in the spike protein: T478K, P681H, and T732A. Its Rt varies between 0.5 and 2.9. Its geographical origin remains to be investigated. Patients infected with variant B.1.1.519 showed a highly significant adjusted odds ratio (aOR) increase of 1.85 over non-B.1.1.519 patients for developing a severe/critical outcome (p = 0.000296, 1.33-2.6 95% CI) and a 2.35-fold increase for hospitalization (p = 0.005, 1.32-4.34 95% CI). The continuous monitoring of this and other variants will be required to control the ongoing pandemic as it evolves.


Subjects
COVID-19/epidemiology; COVID-19/virology; SARS-CoV-2/genetics; Spike Glycoprotein, Coronavirus/genetics; Basic Reproduction Number/statistics & numerical data; Biological Evolution; Genome, Viral; Haplotypes; Humans; Mexico/epidemiology; Mutation; Nasopharynx/virology; Phylogeny; RNA, Viral; SARS-CoV-2/classification
10.
Rev Invest Clin ; 73(6): 339-346, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34292929

ABSTRACT

BACKGROUND: The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic is a current public health concern. Rapid diagnosis is crucial, and reverse transcription polymerase chain reaction (RT-PCR) is presently the reference standard for SARS-CoV-2 detection. OBJECTIVE: Automated RT-PCR analysis (ARPA) is software designed to analyze RT-PCR data for SARS-CoV-2 detection. ARPA loads the RT-PCR data, classifies each sample by assessing its amplification curve behavior, evaluates the experiment's quality, and generates reports. METHODS: ARPA was implemented in the R language and deployed as a Shiny application. We evaluated the performance of ARPA in 140 samples. The samples were manually classified and automatically analyzed using ARPA. RESULTS: ARPA had a true-positive rate = 1, true-negative rate = 0.98, positive-predictive value = 0.95, and negative-predictive value = 1, with 36 samples correctly classified as positive, 100 samples correctly classified as negative, and two samples classified as positive even when labeled as negative by manual inspection. Two samples were labeled as invalid by ARPA and were not considered in the performance metrics calculation. CONCLUSIONS: ARPA is sensitive and specific software that facilitates the analysis of RT-PCR data, and its implementation can reduce the time required in the diagnostic pipeline.


Subjects
COVID-19/diagnosis; Diagnosis, Computer-Assisted; SARS-CoV-2/isolation & purification; Software; COVID-19 Testing; Humans; Reverse Transcriptase Polymerase Chain Reaction; Saliva/virology
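
The performance figures reported for ARPA in entry 10 can be reproduced from the sample counts given in the abstract. The short Python check below uses 36 true positives, 100 true negatives, and 2 false positives as stated; the false-negative count of 0 is an inference from the reported true-positive rate of 1, not a number the abstract gives directly.

```python
# Counts as reported in the abstract; fn = 0 is inferred from the reported
# true-positive rate of 1 and is not stated explicitly.
tp, tn, fp, fn = 36, 100, 2, 0

metrics = {
    "true_positive_rate": tp / (tp + fn),
    "true_negative_rate": tn / (tn + fp),
    "positive_predictive_value": tp / (tp + fp),
    "negative_predictive_value": tn / (tn + fn),
}

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

The four counts sum to 138, which matches the 140 evaluated samples minus the two that ARPA labeled as invalid.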
11.
Front Oncol ; 10: 572954, 2020.
Article in English | MEDLINE | ID: mdl-33194675

ABSTRACT

Studies have suggested a potential role of somatic mitochondrial mutations in cancer development. To analyze the landscape of somatic mitochondrial mutation in breast cancer and to determine whether mitochondrial DNA (mtDNA) mutational burden is correlated with overall survival (OS), we sequenced whole mtDNA from 92 matched-paired primary breast tumors and peripheral blood. A total of 324 germline variants and 173 somatic mutations were found in the tumors. The most common germline allele was 663G (12S), showing lower heteroplasmy levels in peripheral blood lymphocytes than in their matched tumors, even reaching homoplasmic status in several cases. The heteroplasmy load was higher in tumors than in their paired normal tissues. Somatic mtDNA mutations were found in 73.9% of breast tumors; 59% of these mutations were located in the coding region (66.7% non-synonymous and 33.3% synonymous). Although the CO1 gene presented the highest number of mutations, tRNA genes (T, C, and W), rRNA 12S, and CO1 and ATP6 exhibited the highest mutation rates. No specific mtDNA mutational profile was associated with molecular subtypes of breast cancer, and we found no correlation between mtDNA mutational burden and OS. Future investigations will provide insight into the molecular mechanisms through which mtDNA mutations and heteroplasmy shifting contribute to breast cancer development.

12.
Front Physiol ; 11: 588012, 2020.
Article in English | MEDLINE | ID: mdl-33391012

ABSTRACT

Metabolism is loosely defined as the set of physical and chemical interactions associated with the processes responsible for sustaining life. Two evident features arise whenever one looks at metabolism: first, metabolism is a very complex and intertwined construct of many associated biomolecular processes. Second, metabolism is characterized by a high degree of stability, reflected by the organism's resilience to either environmental changes or pathogenic conditions. Here we investigate the relationship between these two features. By having access to the full set of human metabolic interactions as reported in the highly curated KEGG database, we built an integrated human metabolic network comprising metabolic, transcriptional regulation, and protein-protein interaction networks. We hypothesized that a metabolic process may exhibit resilience if it can recover from perturbations at the pathway level; in other words, metabolic resilience could be due to pathway crosstalk, which may imply that a metabolic process could proceed even when a perturbation has occurred. By analyzing the topological structure of the integrated network, as well as the hierarchical structure of its main modules or subnetworks, we observed that behind biological resilience lies an intricate communication structure at the topological and functional level with pathway crosstalk as the main component. The present findings, alongside the advent of large biomolecular databases such as KEGG, may allow the study of the consequences of this redundancy and resilience for healthy and pathological phenotypes, with many potential applications in biomedical science.

13.
Nucleic Acids Res ; 47(D1): D212-D220, 2019 01 08.
Article in English | MEDLINE | ID: mdl-30395280

ABSTRACT

RegulonDB, first published 20 years ago, is a comprehensive electronic resource about regulation of transcription initiation of Escherichia coli K-12 with decades of knowledge from classic molecular biology experiments, and recently also from high-throughput genomic methodologies. We curated the literature to keep RegulonDB up to date, and initiated curation of ChIP and gSELEX experiments. We estimate that current knowledge describes between 10% and 30% of the expected total number of transcription factor-gene regulatory interactions in E. coli. RegulonDB provides datasets for interactions for which there is no evidence that they affect expression, as well as expression datasets. We developed a proof-of-concept pipeline to merge binding and expression evidence to identify regulatory interactions. These datasets can be visualized in the RegulonDB JBrowse. We developed the Microbial Conditions Ontology with a controlled vocabulary for the minimal properties to reproduce an experiment, which helps integrate data from high-throughput and classic literature. At a higher level of integration, we report Genetic Sensory-Response Units for 200 transcription factors, including their regulation at the metabolic level, and include summaries for 70 of them. Finally, we summarize our research with natural language processing strategies to enhance our biocuration work.


Subjects
Computational Biology/methods; Escherichia coli K12/genetics; Gene Expression Regulation, Bacterial; Genomics; Gene Ontology; Gene Regulatory Networks; Genomics/methods; High-Throughput Nucleotide Sequencing
15.
Proc Natl Acad Sci U S A ; 115(21): 5516-5521, 2018 05 22.
Article in English | MEDLINE | ID: mdl-29735690

ABSTRACT

The precise determination of de novo genetic variants has enormous implications across different fields of biology and medicine, particularly personalized medicine. Currently, de novo variations are identified by mapping sample reads from a parent-offspring trio to a reference genome, allowing for a certain degree of differences. While widely used, this approach often introduces false-positive (FP) results due to misaligned reads and mischaracterized sequencing errors. In a previous study, we developed an alternative approach to accurately identify single nucleotide variants (SNVs) using only perfect matches. However, this approach could be applied only to haploid regions of the genome and was computationally intensive. In this study, we present a unique approach, coverage-based single nucleotide variant identification (COBASI), which allows the exploration of the entire genome using second-generation short sequence reads without extensive computing requirements. COBASI identifies SNVs using changes in coverage of exactly matching unique substrings, and is particularly suited for pinpointing de novo SNVs. Unlike other approaches that require population frequencies across hundreds of samples to filter out any methodological biases, COBASI can be applied to detect de novo SNVs within isolated families. We demonstrate this capability through extensive simulation studies and by studying a parent-offspring trio we sequenced using short reads. Experimental validation of all 58 candidate de novo SNVs and a selection of non-de novo SNVs found in the trio confirmed zero FP calls. COBASI is available as open source at https://github.com/Laura-Gomez/COBASI for any researcher to use.


Subjects
DNA Copy Number Variations; Genome, Human; High-Throughput Nucleotide Sequencing/methods; Parents; Polymorphism, Single Nucleotide; Sequence Analysis, DNA/methods; Software; Algorithms; Child; Humans
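
The core idea in entry 15, detecting SNVs from changes in the coverage of exactly matching unique substrings, can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the COBASI implementation (which is available at the repository cited in the abstract): the function names, the toy sequences, and k = 4 are assumptions, and real data would need handling of sequencing errors, diploidy, and both strands.

```python
from collections import Counter


def unique_kmer_positions(reference, k):
    """Map each k-mer that occurs exactly once in `reference` to its start."""
    starts = range(len(reference) - k + 1)
    counts = Counter(reference[i:i + k] for i in starts)
    return {reference[i:i + k]: i for i in starts
            if counts[reference[i:i + k]] == 1}


def unique_kmer_coverage(reference, reads, k):
    """Count exact read matches for each unique reference k-mer position."""
    positions = unique_kmer_positions(reference, k)
    coverage = [0] * (len(reference) - k + 1)
    for read in reads:
        for j in range(len(read) - k + 1):
            start = positions.get(read[j:j + k])
            if start is not None:
                coverage[start] += 1
    return coverage


k = 4
reference = "GATTACAGGCTTAACCGT"   # toy sequence; every 4-mer is unique
sample = "GATTACAGTCTTAACCGT"      # same sequence with G>T at 0-based index 8
reads = [sample[s:s + 10] for s in range(len(sample) - 9)]  # error-free reads

coverage = unique_kmer_coverage(reference, reads, k)
zero_run = [i for i, c in enumerate(coverage) if c == 0]

# A single substitution wipes out the k consecutive k-mers that overlap it,
# so the variant sits at the last start position of the zero-coverage run.
snv_position = zero_run[-1] if len(zero_run) == k else None
print(coverage)
print(zero_run, snv_position)
```

In this toy case, the only signal used is a run of exactly k unique k-mers with zero exact-match coverage, which is the kind of coverage signature the abstract describes.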
17.
Genetics ; 208(4): 1631-1641, 2018 04.
Article in English | MEDLINE | ID: mdl-29367403

ABSTRACT

We present a conceptually simple, sensitive, precise, and essentially nonstatistical solution for the analysis of genome variation in haploid organisms. The generation of a Perfect Match Genomic Landscape (PMGL), which computes intergenome identity with single nucleotide resolution, reveals signatures of variation wherever a query genome differs from a reference genome. Such signatures encode the precise location of different types of variants, including single nucleotide variants, deletions, insertions, and amplifications, effectively introducing the concept of a general signature of variation. The precise nature of variants is then resolved through the generation of targeted alignments between specific sets of sequence reads and known regions of the reference genome. Thus, the perfect match logic decouples the identification of the location of variants from the characterization of their nature, providing a unified framework for the detection of genome variation. We assessed the performance of the PMGL strategy via simulation experiments. We determined the variation profiles of natural genomes and of a synthetic chromosome, both in the context of haploid yeast strains. Our approach uncovered variants that have previously escaped detection. Moreover, our strategy is ideally suited for further refining high-quality reference genomes. The source codes for the automated PMGL pipeline have been deposited in a public repository.


Subjects
Genetic Variation; Genome; Genomics; Haploidy; Chromosomes; Computational Biology; Computer Simulation; Genetic Testing; Genome, Fungal; Genome, Human; Genome-Wide Association Study; Genomics/methods; Humans; Polymorphism, Single Nucleotide; Whole Genome Sequencing; Yeasts/genetics
18.
Proc Natl Acad Sci U S A ; 108(37): 15294-9, 2011 Sep 13.
Article in English | MEDLINE | ID: mdl-21876154

ABSTRACT

We have entered the era of individual genomic sequencing, and can already see exponential progress in the field. It is of utmost importance to exclude false-positive variants from reported datasets. However, because of the nature of the algorithms used, this task has not been optimized to the required level of precision. This study presents a unique strategy for identifying SNPs, called COIN-VGH, that largely minimizes the presence of false-positives in the generated data. The algorithm was developed using the X-chromosome-specific regions from the previously sequenced genomes of Craig Venter and James Watson. The algorithm is based on the concept that a nucleotide can be individualized if it is analyzed in the context of its surrounding genomic sequence. COIN-VGH consists of defining the most comprehensive set of nucleotide strings of a defined length that map with 100% identity to a unique position within the human reference genome (HRG). This set is used to retrieve sequence reads from a query genome (QG), allowing the production of a genomic landscape that represents a draft HRG-guided assembly of the QG. This landscape is analyzed for specific signatures that indicate the presence of SNPs. The fidelity of the variation signature was assessed using simulation experiments by virtually altering the HRG at defined positions. Finally, the signature regions identified in the HRG and in the QG reads are aligned and the precise nature and position of the corresponding SNPs are detected. The advantages of COIN-VGH over previous algorithms are discussed.


Subjects
Computer Simulation; Genome, Human/genetics; Nucleic Acid Hybridization/methods; Nucleotides/genetics; Polymorphism, Single Nucleotide/genetics; Chromosomes, Human, X/genetics; DNA Probes/metabolism; Humans; Reference Standards